Naive Bayes and Text Classification I - Introduction and Theory
Author
Abstract
2 Naive Bayes Classification
2.1 Overview
2.2 Posterior Probabilities
2.3 Class-conditional Probabilities
2.4 Prior Probabilities
2.5 Evidence
2.6 Multinomial Naive Bayes - A Toy Example
2.6.1 Maximum-Likelihood Estimates
2.6.2 Classification
2.6.3 Additive Smoothing
Similar resources
A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and the Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...
In silico prediction of anticancer peptides by TRAINER tool
Cancer is one of the leading causes of death in the world. Several treatment methods exist against cancer cells, such as radiotherapy and chemotherapy. Since traditional methods have side effects on normal cells and are expensive, identifying and developing new methods for cancer therapy is very important. Antimicrobial peptides, present in a wide variety of organisms, such as plants, amphibians and ...
Poisson naive Bayes for text classification with feature weighting
In this paper, we investigate the use of a multivariate Poisson model and feature weighting to learn a naive Bayes text classifier. Our new naive Bayes text classification model assumes that a document is generated by a multivariate Poisson model, while previous works consider a document as a vector of binary term features based on the presence or absence of each term. We also explore the use of...
Bridging the Gap between Naive Bayes and Maximum Entropy Text Classification
The naive Bayes and maximum entropy approaches to text classification are typically discussed as completely unrelated techniques. In this paper, however, we show that both approaches are simply two different ways of doing parameter estimation for a common log-linear model of class posteriors. In particular, we show how to map the solution given by maximum entropy into an optimal solut...
An Improved Naive Bayes Text Classification Algorithm In Chinese Information Processing
In Chinese information processing, Naive Bayes is a simple text classification method that is easy to implement. Its core is the algorithm for calculating the posterior probability and the effective dimensionality reduction of feature words. This paper improves Naive Bayes text classification in terms of calculating the posterior probability and reducing the dimension of feature words of text...
Journal title: CoRR
Volume: abs/1410.5329
Publication date: 2014